Chaperones increase the rate of correct folding by binding newly synthesized polypeptides before they are completely folded. They prevent the formation of incorrectly folded intermediates that may trap the polypeptide in an aberrant form. Chaperones work by binding to exposed hydrophobic patches on misfolded or incompletely folded proteins and hydrolyzing ATP. Hsp70 acts early in protein folding, binding to polypeptide chains as they emerge from ribosomes wherever there is a stretch of 7 hydrophobic amino acids. The motif often recognized by the BiP chaperone, an Hsp70 chaperone localized to the lumen of the endoplasmic reticulum, takes the form: Hy(W/X)HyXHyXHy
Hy: large hydrophobic amino acid [Trp, Leu, Phe]; W: Trp; X: any amino acid
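The motif above can be checked against a sequence with a short regular expression. This is only a sketch: it assumes Hy is exactly {Trp, Leu, Phe} as listed, and it treats the (W/X) position as "any residue", since X already includes Trp. The names `BIP_MOTIF` and `find_motif` are made up for illustration.

```python
import re

# Hy = large hydrophobic residue (W, L or F); X = any residue.
# The (W/X) position accepts any residue, so it is written as ".".
# The lookahead lets overlapping hits be reported as well.
BIP_MOTIF = re.compile(r"(?=([WLF].[WLF].[WLF].[WLF]))")

def find_motif(sequence):
    """Return (start_index, matched_heptamer) for every motif hit."""
    return [(m.start(), m.group(1)) for m in BIP_MOTIF.finditer(sequence.upper())]

print(find_motif("AALWLALALAA"))  # [(2, 'LWLALAL')]
```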
Developments as therapeutics: a ligand binds to the target protein but is also attached to a small peptide that is recognized by a ubiquitin ligase, so the target is directed to the proteasome for degradation –> PROTACs [PROteolysis TArgeting Chimeras]
AIM: steer a protein towards a degradation pathway.
AGGREGATED PROTEINS: More specifically, we are trying to design peptides that can redirect the small oligomers responsible for the build-up of proteins and the formation of aggregates.
We are now looking for peptides that can attract chaperone proteins, specifically Hsp70 and potentially Hsc70, because Hsc70 is involved in chaperone-mediated autophagy. [Aggregates are usually clumpy and large, and it would be difficult to target them through normal protein degradation pathways.]
Future work: LC3 adapter protein for autophagy of aggregates
Looking towards computational techniques to explore the potential peptide space.
Starts by mutating every position of the substrate ‘NRLLLTG’ to alanine: ‘AAAAAAA’ –> FoldX removes the 1st index, so: ‘AAAAAA’. At each position, mutate to every other amino acid.
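As a rough sketch of the enumeration step only (scoring with FoldX is not shown), the single-point mutants of the alanine scaffold can be listed as below. The function name `single_point_mutants` is hypothetical, and the 20 standard amino acids are assumed.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues (assumed alphabet)

def single_point_mutants(peptide):
    """List every sequence that differs from `peptide` at exactly one position."""
    mutants = []
    for pos, original in enumerate(peptide):
        for aa in AMINO_ACIDS:
            if aa != original:
                mutants.append(peptide[:pos] + aa + peptide[pos + 1:])
    return mutants

scaffold = "A" * 6  # 'AAAAAA', the alanine hexamer left after the first index is dropped
mutants = single_point_mutants(scaffold)
print(len(mutants))  # 6 positions x 19 substitutions = 114
```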
Entirely deterministic
Starts with the alanine substrate and mutates each index to all other amino acids. Those that are above a particular threshold (the minimum; a poor choice on my part) were mutated in their second position to every other amino acid…
Starting from the binding pocket (index 432567), mutate each position to all other amino acids. Those that were above the mean of that position are considered for the next round of mutations. Results show charged amino acids flanking a hydrophobic center!
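The "above the mean of that position" filter could look something like the sketch below. The notes do not say whether the score is the raw energy or a sign-flipped value, so this assumes a score where higher is better; `keep_above_mean` is a hypothetical helper and the numbers in the usage line are toy values, not results.

```python
from statistics import mean

def keep_above_mean(scores):
    """Given {residue: score} for one position, keep residues scoring above the mean.

    Assumption: higher score = better binder. For raw energies, where more
    negative is better, the scores would need to be negated first.
    """
    threshold = mean(scores.values())
    return sorted(res for res, score in scores.items() if score > threshold)

# Toy values for a single position (illustrative only)
print(keep_above_mean({"A": 1.0, "L": 3.0, "W": 5.0, "D": -1.0}))  # ['L', 'W']
```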
An optimization process where the aim is to improve the ability of individuals to survive (on some metric).
Representation of the Solutions: hexamer (6 aa) peptides. An evolutionary algorithm utilizes a population of individuals that represent a set of possible solutions.
Fitness Function: maps a representation of the solution to a scalar value. In my implementation an implicit fitness remapping is used: FoldX AnalyseComplex Interaction Energy. The evaluation of the fitness function represents how close the solution is to the optimal solution (more later).
Initial population: 100 hexamer peptides. The goal is to ensure that the initial population is a uniform representation of the entire search space. Improvements: if prior knowledge of the search space is available, we can use it to bias the initial population. Limitations: early convergence, and not all of the search space will be explored.
Sample Size: affects accuracy and convergence. Small size: time complexity per generation is lower, but more generations may be needed to converge. Improvements: force smaller populations to explore a larger search space by increasing the rate of mutation (explained later). Large size: covers a larger area of the search space, but time complexity increases.
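A minimal sketch of a uniform initial population, assuming each residue is drawn independently and uniformly from the 20 standard amino acids. `random_population` and its parameters are illustrative names, not the actual implementation.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # assumed alphabet: 20 standard residues

def random_population(size=100, length=6, rng=None):
    """Draw `size` peptides of `length` residues, each residue uniform at random."""
    rng = rng or random.Random()
    return ["".join(rng.choice(AMINO_ACIDS) for _ in range(length))
            for _ in range(size)]

population = random_population(rng=random.Random(0))  # seeded for reproducibility
print(len(population), len(population[0]))  # 100 6
```

Biasing the initial population with prior knowledge would amount to replacing the uniform `rng.choice` with position-specific weights.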
Roulette wheel Selection: The fitness of the peptide influences the probability of it being selected. The chance of an individual being chosen is proportional to its fitness.
\[p_i=\frac{f_i}{\sum_{j=1}^{N} f_j}\] A random number \(\zeta\) is generated in [0,1]; if \(\zeta \le p_i\) –> selected
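A sketch of roulette wheel selection under two assumptions that are not stated in the notes: the fitness is taken as the negated interaction energy (so more negative energies give larger fitness), and the random number is compared against the running sum of the probabilities, which is the usual way to implement the wheel. Function names are hypothetical.

```python
import random

def selection_probabilities(energies):
    """Map interaction energies to p_i = f_i / sum_j f_j, with f = -energy (assumed)."""
    fitness = [max(-e, 0.0) for e in energies]  # clamp unfavourable energies to 0
    total = sum(fitness)
    if total == 0:
        raise ValueError("all fitness values are zero; cannot build a wheel")
    return [f / total for f in fitness]

def roulette_select(population, probabilities, rng=None):
    """Spin the wheel once: return the first individual whose cumulative p covers zeta."""
    rng = rng or random.Random()
    zeta = rng.random()
    cumulative = 0.0
    for individual, p in zip(population, probabilities):
        cumulative += p
        if zeta <= cumulative:
            return individual
    return population[-1]  # guard against floating-point round-off

probs = selection_probabilities([-18.0, -12.0, -6.0])  # toy energies
print(probs[0])  # 0.5
```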
Elitism: selection of individuals from the current generation to survive to the next generation. The number of individuals that survive to the next generation without mutation is referred to as the generation gap. If gap=0: the new generation consists entirely of new individuals. The best k (0.6) individuals survive to the next generation → this ensures that the max fitness does not decrease.
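Elitism could be sketched as below, reading k = 0.6 as the fraction of the population that is carried over unchanged and assuming that a more negative interaction energy means a fitter peptide. Both readings are assumptions, and `select_elites` is a hypothetical name.

```python
def select_elites(population, energies, fraction=0.6):
    """Return the best `fraction` of the population, ranked by energy (lowest first)."""
    n_keep = round(len(population) * fraction)
    ranked = sorted(zip(energies, population))  # most negative energy comes first
    return [peptide for _, peptide in ranked[:n_keep]]

# Toy energies for five placeholder individuals
print(select_elites(["A", "B", "C", "D", "E"], [-1, -5, -3, -2, -4]))  # ['B', 'E', 'C']
```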
Two point Cross-over: the aim is to produce offspring from two parents chosen by the selection operator. It is not necessary that each group of parents produces offspring. Crossover occurs with a certain probability \(p_c=0.8\), such that if \(\zeta \le p_c\), with \(\zeta \in [0,1]\), then there is a crossover event. In my implementation, each crossover event results in 2 offspring. Two positions are randomly selected and the substrings between these points are swapped.
Random Mutation: the aim of mutation is to introduce new genetic material into the existing individual –> add diversity. Mutation and crossover are the operators that allow exploration of a large range of existing solutions. Mutations occur at the rate \(p_m=0.3\). Usually, small values are used as the mutation rate to ensure that the resulting mutated children are not distorted too much (more later). If \(\zeta \le p_m\), with \(\zeta \in [0,1]\) –> mutate
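The two operators can be sketched as follows. Some details are assumptions rather than facts about the implementation: when no crossover event occurs the parents are passed through unchanged, the mutation rate is applied once per peptide, and a mutation replaces a single randomly chosen residue. All names are illustrative.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # assumed alphabet: 20 standard residues

def two_point_crossover(parent_a, parent_b, p_c=0.8, rng=None):
    """With probability p_c, swap the substring between two random cut points."""
    rng = rng or random.Random()
    if rng.random() > p_c:
        return parent_a, parent_b  # no crossover event (assumed: parents pass through)
    i, j = sorted(rng.sample(range(len(parent_a) + 1), 2))  # two distinct cut points
    child_a = parent_a[:i] + parent_b[i:j] + parent_a[j:]
    child_b = parent_b[:i] + parent_a[i:j] + parent_b[j:]
    return child_a, child_b

def mutate(peptide, p_m=0.3, rng=None):
    """With probability p_m, replace one random residue with a different amino acid."""
    rng = rng or random.Random()
    if rng.random() > p_m:
        return peptide
    pos = rng.randrange(len(peptide))
    choices = [aa for aa in AMINO_ACIDS if aa != peptide[pos]]
    return peptide[:pos] + rng.choice(choices) + peptide[pos + 1:]

rng = random.Random(0)
children = two_point_crossover("AAAAAA", "LLLLLL", rng=rng)
print([mutate(child, rng=rng) for child in children])
```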
[Figure: Interaction Energy (y-axis) vs. # PDB / peptides (x-axis)]
[Figure: Interaction Energy (y-axis) vs. Generations (x-axis)]
Create a dataframe with all the peptides; remember there will be duplicate sequences.
| peptides | freq_peptides | interactionEnergy |
|---|---|---|
| RWYIPR | 2 | -18.18153 |
| RWLMPL | 1 | -18.08830 |
| VWYIPL | 2 | -18.01177 |
| VWLMPL | 1 | -17.67277 |
| RWFIEY | 2 | -17.51690 |
| EWYIEF | 1 | -17.40450 |
| RWYLHP | 1 | -17.33623 |
| FWYLMP | 1 | -17.29543 |
| EWFIEF | 1 | -17.27043 |
| EWFIDF | 2 | -17.00780 |
| MWYLMP | 3 | -16.96713 |
| EWFMEF | 1 | -16.96180 |
| RWFKEY | 1 | -16.79350 |
| VWDLRY | 2 | -16.67823 |
| VWHMPL | 1 | -16.63530 |
| FWFLMP | 1 | -16.60080 |
| EWFIGF | 1 | -16.29400 |
| VWYLEA | 1 | -16.21840 |
| VWFIEL | 1 | -16.19760 |
| RALMKY | 1 | -16.16453 |
| KCYLRY | 1 | -16.11610 |
| EWDLRF | 1 | -15.98983 |
| VWDMPL | 1 | -15.93440 |
| VWDLRP | 1 | -15.92400 |
| VWEIMP | 1 | -15.77400 |
| RILMRR | 1 | -15.76753 |
| VWYICA | 1 | -15.73077 |
| VWLMHP | 1 | -15.71137 |
| RILICL | 1 | -15.67273 |
| NWFICL | 1 | -15.67257 |
| LKHMPP | 1 | -15.65437 |
| LFHMPL | 1 | -15.57600 |
| VWDLMY | 3 | -15.57410 |
| NWFIEL | 1 | -15.49953 |
| RWDLMY | 1 | -15.47767 |
| RCFIEY | 1 | -15.42807 |
| VLKIMP | 2 | -15.38503 |
| VWFMEK | 1 | -15.37583 |
| RAYMPR | 1 | -15.37110 |
| VWYLCA | 1 | -15.36723 |
| VWFKEP | 1 | -15.36190 |
| MILIMP | 1 | -15.29850 |
| VAYMPL | 1 | -15.28110 |
| RAFMPR | 1 | -15.22777 |
| VWFMCA | 1 | -15.22103 |
| VWFMGL | 1 | -15.19027 |
| AFYIDF | 1 | -15.15043 |
| RILLHR | 1 | -15.07613 |
| FWYLET | 1 | -15.07257 |
| VMHIPL | 6 | -15.05593 |
| MILMMP | 1 | -14.97693 |
| RILMCL | 1 | -14.97257 |
| RCYIGY | 1 | -14.88833 |
| NWFMGL | 1 | -14.86623 |
| VMHMPL | 1 | -14.86353 |
| RILMHR | 2 | -14.81973 |
| VWDLMP | 1 | -14.80067 |
| VMHMFY | 2 | -14.75083 |
| TMHMPL | 2 | -14.72100 |
| RCFKPL | 1 | -14.65093 |
| VWDMMP | 2 | -14.64550 |
| NWYLEC | 1 | -14.61237 |
| VILMCL | 2 | -14.58650 |
| VMHMFP | 2 | -14.15240 |
| VWDLEA | 1 | -14.12530 |
| KFDLHY | 1 | -13.97017 |
| VWDLEC | 1 | -13.46173 |
| RCDLRY | 6 | -13.38167 |
| NMHICL | 1 | -13.16787 |
| CRLMDT | 4 | -13.05393 |
| VMDMPL | 1 | -12.57067 |

| peptides | freq_peptides | interactionEnergy |
|---|---|---|
| DPRLFPW | 1 | -16.50707 |
| QFMMFPY | 2 | -16.21813 |
| QLMMFPD | 1 | -16.06513 |
| ILMMFPD | 1 | -16.00513 |
| VMPLFPR | 1 | -15.74783 |
| QFLLPIY | 1 | -15.71213 |
| NPLLPII | 1 | -15.58663 |
| LMPLFIR | 1 | -15.53827 |
| NPMMFPI | 1 | -15.53333 |
| VMPLFIR | 1 | -15.46510 |
| DPFLFPW | 1 | -15.38390 |
| LFLIIFV | 5 | -15.22323 |
| DFRMTPW | 1 | -15.06197 |
| VDPLFPR | 1 | -15.02960 |
| EIPLFPA | 1 | -14.98060 |
| NLRLWIY | 1 | -14.96173 |
| VLMLFTR | 2 | -14.93977 |
| DIPLFPW | 1 | -14.87653 |
| APFIFIV | 1 | -14.75227 |
| KVRLWIY | 5 | -14.70183 |
| QAMMFPD | 1 | -14.65450 |
| QKLVFPD | 1 | -14.61573 |
| LKPLFIV | 1 | -14.58803 |
| QPRLWID | 3 | -14.53497 |
| QLMMIYD | 1 | -14.47693 |
| WVSMFPR | 1 | -14.43687 |
| AMPLFID | 1 | -14.40473 |
| KARLWIY | 1 | -14.40053 |
| EWFMNYI | 1 | -14.32020 |
| LGRIIFV | 1 | -14.30933 |
| QWGMPIY | 1 | -14.27170 |
| WDPLFPA | 1 | -14.26513 |
| WEILWIY | 1 | -14.25403 |
| QFLKPIY | 1 | -14.24707 |
| VAPLFIR | 2 | -14.20760 |
| KMRKILY | 1 | -14.17833 |
| APFMWPV | 1 | -14.12367 |
| QLRVIFD | 1 | -14.08600 |
| CERLWIH | 1 | -14.05417 |
| VDPLFIR | 1 | -14.03677 |
| KVRMNLH | 1 | -14.03007 |
| KLRKWIY | 1 | -14.02397 |
| LKLVIFV | 1 | -14.00563 |
| KPFMSYI | 1 | -13.95157 |
| QLFMSYD | 1 | -13.87347 |
| DARLTPW | 1 | -13.83487 |
| KARLSYI | 1 | -13.81090 |
| LRLVIFV | 1 | -13.80523 |
| WDPKFPA | 1 | -13.78527 |
| ILRVIFD | 2 | -13.75590 |
| DLRKTPW | 1 | -13.74073 |
| NARLWIY | 1 | -13.64490 |
| WPIMNLH | 2 | -13.59007 |
| DAPLFPW | 1 | -13.58643 |
| WEIMNLH | 4 | -13.57550 |
| CSMMIFQ | 1 | -13.57027 |
| NARLWLY | 1 | -13.55043 |
| DFRKTPW | 1 | -13.43020 |
| QEMLWID | 4 | -13.39407 |
| PFLKYIK | 1 | -13.37763 |
| KAFLSYI | 3 | -13.13740 |
| KEIMNYI | 2 | -13.09220 |
| KARKWIY | 1 | -12.49083 |
| QARVIFD | 2 | -12.31050 |
| IARVIFD | 1 | -12.29057 |
| WDPMSYI | 1 | -12.26897 |
| CSMKIFQ | 1 | -12.24807 |
| NAMMSYI | 2 | -12.19043 |
| CSFMSFQ | 1 | -12.15817 |
| PGRKYLK | 1 | -12.05280 |
| PALKYLK | 2 | -11.97247 |
| WARLSLA | 1 | -11.74531 |
| PGRVIFD | 1 | -11.50383 |

| peptides | freq_peptides | interactionEnergy |
|---|---|---|
| YFLMLP | 1 | -16.29440 |
| FMFMPL | 2 | -16.04083 |
| MILMHF | 1 | -15.94077 |
| YWHMLL | 1 | -15.72873 |
| FFMMLL | 1 | -15.70983 |
| LKHMLF | 1 | -15.66133 |
| FFFMML | 1 | -15.61387 |
| SFLMLL | 2 | -15.32073 |
| RLMMHL | 1 | -15.14853 |
| QWYLRP | 1 | -15.13430 |
| MWYMDL | 1 | -15.09993 |
| SMFMLL | 1 | -15.02673 |
| FFHMLL | 1 | -14.99177 |
| FCFMQY | 2 | -14.98643 |
| MIFMGF | 1 | -14.98173 |
| VFLMGF | 2 | -14.97930 |
| LWIIPL | 1 | -14.92187 |
| FKYMLL | 1 | -14.89593 |
| MIFMGY | 2 | -14.85770 |
| RMFMGY | 1 | -14.85200 |
| RCFMRY | 1 | -14.82190 |
| FFHMML | 1 | -14.82007 |
| LWIIMP | 6 | -14.81893 |
| LKEIMF | 1 | -14.79217 |
| LKEILF | 3 | -14.77560 |
| LKHMLL | 1 | -14.72487 |
| RLHMHL | 1 | -14.68263 |
| DFHMPL | 1 | -14.67963 |
| RFLMHP | 1 | -14.63247 |
| YWEIMP | 1 | -14.61420 |
| FSLMLL | 1 | -14.61133 |
| YFLMHP | 1 | -14.56273 |
| MKYLRP | 1 | -14.52503 |
| SFHMPL | 2 | -14.49313 |
| SFHMLP | 1 | -14.46367 |
| MFHLRP | 1 | -14.45827 |
| FCFMLL | 3 | -14.44907 |
| SILMTY | 1 | -14.43263 |
| MIPKLF | 1 | -14.42687 |
| SFHMMP | 1 | -14.39320 |
| VFFMMG | 2 | -14.38337 |
| GWIIMY | 4 | -14.37040 |
| LIHIMP | 1 | -14.34927 |
| RILMAP | 1 | -14.34577 |
| GWYMDL | 1 | -14.28900 |
| DCFMPL | 1 | -14.21483 |
| SFHMLL | 1 | -14.20200 |
| MLMMHS | 1 | -14.17323 |
| VKEIMF | 1 | -14.13463 |
| LKYLRP | 1 | -14.09337 |
| SWPLLL | 4 | -14.04427 |
| LFHMLG | 2 | -14.02817 |
| RFHMGY | 1 | -14.02390 |
| RILMHP | 1 | -13.97610 |
| LKEILP | 1 | -13.95940 |
| MIPLHF | 1 | -13.92307 |
| FMFMGL | 1 | -13.91030 |
| LKEIMP | 1 | -13.85153 |
| NWLMCT | 2 | -13.83833 |
| EVHKLF | 1 | -13.79123 |
| MILMCT | 1 | -13.74703 |
| GKYLRY | 1 | -13.73467 |
| NWPLTF | 1 | -13.70810 |
| RCFMGY | 1 | -13.66923 |
| RLMMDS | 1 | -13.60943 |
| FCFMGY | 1 | -13.56093 |
| FFHMGL | 1 | -13.12447 |
| SCHMLL | 1 | -13.09980 |
| RCFMCL | 1 | -13.02253 |
| RCHMGY | 1 | -12.82293 |
| SCFMGY | 1 | -12.78020 |
| LKYLGP | 2 | -12.59557 |
| VILMTS | 1 | -12.47337 |
| DCEIPL | 1 | -12.30853 |
| VFPMGS | 1 | -10.99050 |
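
Tables like the ones above (peptide, frequency, interaction energy) could be assembled from the raw list of evaluated peptides roughly as follows. This is a sketch using only the standard library instead of a dataframe; it assumes duplicates are collapsed by averaging their energies, which the notes do not confirm. `summarize` is a hypothetical helper and the input values are toy numbers.

```python
from collections import defaultdict
from statistics import mean

def summarize(evaluations):
    """Collapse (peptide, energy) pairs into (peptide, frequency, mean energy) rows.

    Rows are sorted from the most negative (best) energy to the least negative.
    """
    energies_by_peptide = defaultdict(list)
    for peptide, energy in evaluations:
        energies_by_peptide[peptide].append(energy)
    rows = [(pep, len(vals), mean(vals)) for pep, vals in energies_by_peptide.items()]
    return sorted(rows, key=lambda row: row[2])

# Toy input with one duplicated sequence (illustrative values only)
for row in summarize([("PEPTAA", -10.0), ("PEPTBB", -12.0), ("PEPTAA", -10.0)]):
    print(row)
```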